Break it Down for Me: A Study in Automated Lyric Annotation
نویسندگان
چکیده
Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation. We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task.
منابع مشابه
From Lyrics to Their Interpretations: Automated Reading between the Lines
Although musical metadata, such as genre, mood, and usage, have been widely studied, there is a less explored type of metadata, the subject of the lyric, which might be potentially useful in music digital libraries. In my previous research, I have explored the use of the subject metadata in the context of music classification based on it. The biggest challenge of the classification task is that...
متن کاملFuzzy Neighbor Voting for Automatic Image Annotation
With quick development of digital images and the availability of imaging tools, massive amounts of images are created. Therefore, efficient management and suitable retrieval, especially by computers, is one of themost challenging fields in image processing. Automatic image annotation (AIA) or refers to attaching words, keywords or comments to an image or to a selected part of it. In this paper,...
متن کاملThe Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning
In our modern technological world, Computer-Assisted Language learning (CALL) is a new realm towards learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...
متن کاملAn annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
متن کاملDeciding on Units of Analysis within Centering Theory
Many efforts in corpora annotation start with segmenting discourse into units of analysis. In this paper, we present a method for deciding on segmentation units within Centering Theory (Grosz, Joshi, & Weinstein, 1995). We survey the different existing methods to break down discourse into utterances and discuss the results of a comparison study among them. The contribution of our study is that ...
متن کامل